gaussian noise model
Robust Neural Network Regression for Offline and Online Learning
We replace the commonly used Gaussian noise model in nonlinear regression by a more flexible noise model based on the Student-t(cid:173) distribution. The degrees of freedom of the t-distribution can be chosen such that as special cases either the Gaussian distribution or the Cauchy distribution are realized. The latter is commonly used in robust regres(cid:173) sion. Since the t-distribution can be interpreted as being an infinite mix(cid:173) ture of Gaussians, parameters and hyperparameters such as the degrees of freedom of the t-distribution can be learned from the data based on an EM-learning algorithm. We show that modeling using the t-distribution leads to improved predictors on real world data sets.
Bayesian Quantile Matching Estimation
Nirwan, Rajbir-Singh, Bertschinger, Nils
Due to data protection laws sensitive personal data cannot be released or shared among businesses as well as scientific institutions. While anonymization techniques are becoming increasingly popular, they often raise security concerns and have been re-identified in some cases Narayanan and Shmatikov (2010). To be on the safe side, big data collecting organisation such as Eurostat (statistical office of the European Union) or the World Bank only release aggregated summaries of their data. E.g.: Instead of individual salary data only selected quantiles of the population distribution are available. Thus, for exploratory analysis as well as statistical modeling, the need for methods which work on aggregated data is there.
On denoising modulo 1 samples of a function
Cucuringu, Mihai, Tyagi, Hemant
Consider an unknown smooth function $f: [0,1] \rightarrow \mathbb{R}$, and say we are given $n$ noisy mod 1 samples of $f$, i.e., $y_i = (f(x_i) + \eta_i)\mod 1$ for $x_i \in [0,1]$, where $\eta_i$ denotes noise. Given the samples $(x_i,y_i)_{i=1}^{n}$, our goal is to recover smooth, robust estimates of the clean samples $f(x_i) \bmod 1$. We formulate a natural approach for solving this problem which works with angular embeddings of the noisy mod 1 samples over the unit complex circle, inspired by the angular synchronization framework. Our approach amounts to solving a quadratically constrained quadratic program (QCQP) which is NP-hard in its basic form, and therefore we consider its relaxation which is a trust region sub-problem and hence solvable efficiently. We demonstrate its robustness to noise via extensive numerical simulations on several synthetic examples, along with a detailed theoretical analysis. To the best of our knowledge, we provide the first algorithm for denoising mod 1 samples of a smooth function, which comes with robustness guarantees.
Provably robust estimation of modulo 1 samples of a smooth function with applications to phase unwrapping
Cucuringu, Mihai, Tyagi, Hemant
Consider an unknown smooth function $f: [0,1]^d \rightarrow \mathbb{R}$, and say we are given $n$ noisy mod 1 samples of $f$, i.e., $y_i = (f(x_i) + \eta_i)\mod 1$, for $x_i \in [0,1]^d$, where $\eta_i$ denotes the noise. Given the samples $(x_i,y_i)_{i=1}^{n}$, our goal is to recover smooth, robust estimates of the clean samples $f(x_i) \bmod 1$. We formulate a natural approach for solving this problem, which works with angular embeddings of the noisy mod 1 samples over the unit circle, inspired by the angular synchronization framework. This amounts to solving a smoothness regularized least-squares problem -- a quadratically constrained quadratic program (QCQP) -- where the variables are constrained to lie on the unit circle. Our approach is based on solving its relaxation, which is a trust-region sub-problem and hence solvable efficiently. We provide theoretical guarantees demonstrating its robustness to noise for adversarial, and random Gaussian and Bernoulli noise models. To the best of our knowledge, these are the first such theoretical results for this problem. We demonstrate the robustness and efficiency of our approach via extensive numerical simulations on synthetic data, along with a simple least-squares solution for the unwrapping stage, that recovers the original samples of $f$ (up to a global shift). It is shown to perform well at high levels of noise, when taking as input the denoised modulo $1$ samples. Finally, we also consider two other approaches for denoising the modulo 1 samples that leverage tools from Riemannian optimization on manifolds, including a Burer-Monteiro approach for a semidefinite programming relaxation of our formulation. For the two-dimensional version of the problem, which has applications in radar interferometry, we are able to solve instances of real-world data with a million sample points in under 10 seconds, on a personal laptop.